Sliced Inverse Regression for Lifetimes and a Remark on High-Dimensional Graphical Models
Abstract
When analyzing multivariate data, one can appeal to dimension reduction procedures to describe its main features as simply as possible. In this thesis we work with one such method, sliced inverse regression (SIR), and propose a new adaptation to survival data. A popular way to account for censoring is to reweight the observed data points, often by inverse probability weighting. We instead base our strategy on estimating the unobserved information. Our idea is tested with different distributions for the two main survival data models, the Accelerated Lifetime Model and Cox’s proportional hazards model. In both cases, and under different conditions of sparsity, sample size, and parameter dimension, this non-parametric approach evaluates the data structure successfully and can be viewed as a variable selector. We also compare our method with existing techniques and find it to be competitive. In the second part of the thesis, we concentrate on the problem of detecting a partial correlation. The ability to reliably identify a positive or negative partial correlation between the expression levels of two genes is determined by the number p of genes, the number n of analyzed samples, and the statistical properties of the measurements. Classical statistical theory teaches us that the crucial quantity is the product of the square root of the sample size and the size of the partial correlation. This has to be combined with some adjustment for multiplicity depending on p, however, which makes the classical analysis somewhat arbitrary. We investigate this problem through the lens of the Kullback-Leibler divergence, which is a measure of the average information for detecting an effect. As a result, it appears that commonly sized studies in genetic epidemiology are not able to reliably detect moderately strong links.
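The link between the classical quantity (square root of n times the partial correlation) and the Kullback-Leibler divergence can be illustrated with a small sketch. It is not taken from the thesis and rests on a strong simplifying assumption: only two standardized, jointly normal variables are considered, so the partial correlation reduces to the ordinary correlation rho, and the multiplicity adjustment in p is ignored. Under that assumption, the divergence of the correlated model from the independent one is -0.5 * log(1 - rho^2), which is close to rho^2 / 2 for small rho, so n independent samples carry roughly (sqrt(n) * rho)^2 / 2 units of information. The function names below are hypothetical.

```python
import math


def kl_correlated_vs_independent(rho: float) -> float:
    """KL(P_rho || P_0) for standard bivariate normals.

    P_rho has correlation rho, P_0 has correlation 0.
    Closed form: -0.5 * log(1 - rho^2).
    Assumption: a bivariate toy setting, not the thesis's p-gene model.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    # log1p(-x) computes log(1 - x) accurately for small x
    return -0.5 * math.log1p(-rho * rho)


def total_information(n: int, rho: float) -> float:
    """KL divergence accumulated over n independent samples."""
    return n * kl_correlated_vs_independent(rho)


if __name__ == "__main__":
    # Compare the exact value with the small-rho approximation (sqrt(n) * rho)^2 / 2
    for n, rho in [(100, 0.1), (100, 0.3), (1000, 0.1)]:
        exact = total_information(n, rho)
        approx = 0.5 * (math.sqrt(n) * rho) ** 2
        print(f"n={n:5d} rho={rho:.1f} exact={exact:.3f} approx={approx:.3f}")
```

In this toy setting the accumulated information depends on n and rho almost only through sqrt(n) * rho, which matches the classical statement in the abstract. How the number of genes p enters is the subject of the thesis and is not modeled here.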
Similar resources
Using Dimension-Reduction Subspaces to Identify Important Inputs in Models of Physical Systems
Graphical methods based on dimension-reduction subspaces for regression problems (Cook 1994) may be useful for studying the relative importance of inputs in computer models of physical systems. Sliced inverse regression (Li 1991), principal Hessian directions (Li 1992), ceres plots (Cook 1993), and inverse response plots (Cook and Weisberg 1994) are recent methods that can identify characterist...
On Sliced Inverse Regression With High-Dimensional Covariates
Sliced inverse regression is a promising method for the estimation of the central dimension-reduction subspace (CDR space) in semiparametric regression models. It is particularly useful in tackling cases with high-dimensional covariates. In this article we study the asymptotic behavior of the estimate of the CDR space with high-dimensional covariates, that is, when the dimension of the covariat...
Sufficient Dimension Reduction With Missing Predictors
In high-dimensional data analysis, sufficient dimension reduction (SDR) methods are effective in reducing the predictor dimension, while retaining full regression information and imposing no parametric models. However, it is common in high-dimensional data that a subset of predictors may have missing observations. Existing SDR methods resort to the complete-case analysis by removing all the sub...
Consistency of regularized sliced inverse regression for kernel models
We develop an extension of the sliced inverse regression (SIR) framework for dimension reduction using kernel models and Tikhonov regularization. The result is a numerically stable nonlinear dimension reduction method. We prove consistency of the method under weak conditions even when the reproducing kernel Hilbert space induced by the kernel is infinite dimensional. We illustrate the utility o...
Sliced inverse regression for high-dimensional time series
Methods of dimension reduction are very helpful and almost a necessity if we want to analyze high-dimensional time series since otherwise modelling affords many parameters because of interactions at various time-lags. We use a dynamic version of Sliced Inverse Regression (SIR; Li (1991)), which was developed to reduce the dimension of the regressor in regression problems, as an exploratory tool...
Publication date: 2013